## Power consumption of Wi-Fi transceivers



Wissem Benali and Marie Le Bot, Orange Labs, Clos Courtel, Cesson Sévigné, France

Charlotte Langlais and Sylvie Kerouédan, Lab-STICC, Telecom Bretagne, Technopole Brest-Iroise, Brest, France

Abstract— In this paper, the physical layer of the Wi-Fi IEEE 802.11ac standard is analyzed from the energy efficiency point of view. To better assess the energy footprint of Wi-Fi communications, the power dissipated in the circuits to run the signal processing algorithms is taken into account in addition to the classical transmission power at the antenna. Simulations and hardware developments on a Field-Programmable Gate Array board provide a reliable evaluation of the power distribution between transmitter and receiver components. This evaluation enables an identification of the most consuming components, and an analysis of the impact of system parameters on power consumption.

Keywords—power consumption; Wi-Fi; circuits; FPGA

### I. INTRODUCTION

In a context of rapidly growing demand for multimedia services and a multiplication of devices in the Home, the energy footprint of information and communication technologies has become a growing concern. Green networks have raised significant interest in the research community for cellular networks, for instance in the GreenTouch consortium [1] or the European project EARTH [2], but there has been limited research on Home networks. The European Commission has introduced rules and guidelines that constrain the maximum power consumption of home gateways and network devices in [3], defining targets for power consumption but without specifying how to meet them. The 3-year GREENCoMM project, which ended in April 2016, aimed at evaluating the power consumption of home communication technologies and proposed solutions to improve energy efficiency [4]. This paper is based on the work carried out in the GREENCoMM project and focuses on Wi-Fi, which is the most widely used technology for multimedia services at home.

Some interesting qualitative studies of energy efficiency in wireless networks can be found in e.g. [5][6]. In [7], the authors emphasize the importance of the power dissipated in the circuits to run the signal processing algorithms, in addition to the classical transmission power at the antenna. In [8], a programmable IEEE 802.11a chain is proposed, and the complexity of optimized signal processing algorithms is evaluated. In [9], the energy of a particular element, the Fast Fourier Transform (FFT), is analyzed for various architectures. In [10], the impact of the decoder structure on the power consumption is discussed. To complete these studies, this paper introduces a quantitative evaluation of the energy cost of the digital processing elements of the most recent Wi-Fi standards, IEEE 802.11n/ac [11]. The circuit power of each component of the transmitter and receiver is evaluated through both simulations and hardware measurements. Numerical values are provided for several bandwidths and configurations, making it possible to weigh the impact on power consumption of some crucial parameters, such as temperature, bit rate, and modulation and coding schemes. Although the induced power consumption is relatively small, the power savings presented here must be integrated in a global effort to optimize the different parts of the system, which can ultimately represent a large gain at the scale of a country.

Our work focuses on the digital baseband part of the physical (PHY) layer, including coding, FFT, interleaving, etc. The analog part is not considered; details on this topic can be found in e.g. [12]. The paper is organized as follows. After a description of the theoretical context in section II, the practical methodology is provided in section III. Measurements and simulation results are analyzed in section IV, and conclusions are drawn in section V.

### II. BACKGROUND

#### A. Power consumption

In a communication system, the overall power cost includes both the transmission power at the antenna and the circuit power used for signal processing. The transmission power is classically used in link budgets, and is limited by the regulations of the different countries. In Europe, the maximum radiated power in the 5 GHz band ranges from 100 to 1000 mW [13]. Up to now, only a few studies have considered the circuit power; nevertheless, it must also be taken into account when designing communication systems, especially in indoor environments, which involve short distances and thus small transmission powers.

The circuit power $P_c$ includes the static and dynamic powers, $P_{stat}$ and $P_{dyn}$ respectively [14]. The static power is mainly due to leakage currents, and depends on the circuit technology, the supply voltage and the ambient temperature. Its theoretical expression is (in W): $P_{stat} = V i_{leak}$, where $i_{leak}$ is the leakage current (in A), which increases with the temperature [15], and $V$ is the supply voltage (in V) of the integrated circuit core.

The dynamic power is caused by the switching activity of transistors; its theoretical expression is (in W):  $P_{dyn} = \alpha C V^2 f$ , where  $\alpha$  represents the switching probability of a design element, C is the node capacitance (in Farads), and f is the system clock frequency (in Hz).
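The two expressions above can be sketched in a few lines of Python. This is only an illustration of the model: the function names and the numerical values of $\alpha$, C and $i_{leak}$ are hypothetical and are not taken from the study; only the 1 V supply voltage matches the setup described later in the paper.

```python
def static_power_w(supply_voltage_v: float, leakage_current_a: float) -> float:
    """Static power P_stat = V * i_leak, in watts."""
    return supply_voltage_v * leakage_current_a


def dynamic_power_w(alpha: float, capacitance_f: float,
                    supply_voltage_v: float, clock_hz: float) -> float:
    """Dynamic power P_dyn = alpha * C * V^2 * f, in watts."""
    return alpha * capacitance_f * supply_voltage_v ** 2 * clock_hz


def circuit_power_w(alpha: float, capacitance_f: float, supply_voltage_v: float,
                    clock_hz: float, leakage_current_a: float) -> float:
    """Circuit power P_c = P_stat + P_dyn, in watts."""
    return (static_power_w(supply_voltage_v, leakage_current_a)
            + dynamic_power_w(alpha, capacitance_f, supply_voltage_v, clock_hz))


# Hypothetical values, chosen only to illustrate the scaling with frequency.
V = 1.0          # supply voltage (V)
I_LEAK = 0.2     # leakage current (A), assumed
ALPHA = 0.1      # switching probability, assumed
C = 1e-9         # node capacitance (F), assumed

p_dyn_20 = dynamic_power_w(ALPHA, C, V, 20e6)
p_dyn_40 = dynamic_power_w(ALPHA, C, V, 40e6)
print(f"P_dyn at 20 MHz: {p_dyn_20 * 1e3:.1f} mW, at 40 MHz: {p_dyn_40 * 1e3:.1f} mW")
```

With all other parameters fixed, doubling the clock frequency doubles the dynamic power, while the static power is unaffected by the clock.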

#### B. Wi-Fi context

The major components of the Wi-Fi digital baseband transmission (Tx) and reception (Rx) chains are represented in Fig. 1. The Wi-Fi IEEE 802.11 standards only specify the transmission part, with several parameter sets (bandwidth, modulation and coding scheme, etc.). The receiver is not standardized and includes some proprietary algorithms; however, its main features can be assumed. The system is based on Orthogonal Frequency Division Multiplexing (OFDM), built with direct and inverse Fast Fourier Transform, (I)FFT, blocks, so that several modulated symbols are transmitted simultaneously on different subcarrier frequencies. The framing step consists in building the OFDM symbol, which includes the data symbols but also pilot symbols and null symbols. The pilots are mainly used for synchronization and channel estimation, and the null symbols are located in the guard bands. A cyclic prefix (CP) is added at the beginning of each OFDM symbol to absorb the intersymbol interference due to multipath channel propagation.



Fig. 1. Block diagram of the baseband PHY layer chains

### III. METHODOLOGY

#### A. Tools

The integrated circuits that make up the Wi-Fi transceivers are generally Application-Specific Integrated Circuits (ASIC) designed for a particular use, and not programmable. For cost and flexibility reasons, another type of integrated circuits is used for experimental tests: the Field-Programmable Gate Array (FPGA) that includes reconfigurable memories and transistors. These circuits are very convenient for testing, but their operation requires more energy.

In this study, we use both real measurements on an FPGA board and a simulation tool to evaluate the energy cost of the blocks of the Wi-Fi Tx/Rx chains. A simulation chain has been developed in Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) to evaluate the circuit power based on the theoretical expressions given in section II, via the XPower analyzer tool from Xilinx [16]. The consumption of each block is evaluated by simulation, in order to obtain the distribution of the power consumption within the whole Tx and Rx chains. In parallel, a hardware implementation has been carried out on an FPGA board to provide real measurements of the power consumption. The Fusion tool [17] gathers the current and the voltage of the supply controller via a probe, so that the total circuit power dissipated in the FPGA can be calculated. The static power is measured by setting the clock frequency to 0 via an FPGA reset, and the dynamic power is obtained by subtracting the static power from the total power.
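The measurement procedure described above amounts to two voltage/current readings and one subtraction. The Python sketch below illustrates it; the function names and the readings are hypothetical and do not reproduce the interface of the Fusion tool or any value measured in the study.

```python
def power_w(voltage_v: float, current_a: float) -> float:
    """Power drawn from the supply rail, P = V * I, in watts."""
    return voltage_v * current_a


def split_static_dynamic(v_running: float, i_running: float,
                         v_reset: float, i_reset: float) -> tuple[float, float]:
    """Return (static, dynamic) power in watts.

    The reading taken while the design is held in reset (clock at 0)
    gives the static power; the dynamic power is the total power of the
    running design minus that static power.
    """
    total = power_w(v_running, i_running)
    static = power_w(v_reset, i_reset)
    return static, total - static


# Hypothetical readings of the core supply rail, for illustration only.
static_w, dynamic_w = split_static_dynamic(
    v_running=1.0, i_running=0.45,   # design running
    v_reset=1.0, i_reset=0.30,       # design held in reset
)
print(f"static: {static_w * 1e3:.0f} mW, dynamic: {dynamic_w * 1e3:.0f} mW")
```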

#### B. Scenarios

The baseband Tx/Rx chains of the Wi-Fi standard IEEE 802.11ac [11] have been simulated at a 5 GHz carrier frequency, for 20 MHz, 40 MHz and 80 MHz bandwidths, and for several Modulation and Coding Schemes (MCSs), with one antenna at both the transmission and reception sides. The tested MCSs use Quaternary Phase Shift Keying (QPSK) and Quadrature Amplitude Modulation (QAM) with coding rates 1/2, 2/3 and 3/4. The OFDM symbol duration is 4 $\mu$s. The forward error correction scheme is a binary convolutional encoder, and the decoder is typically a Viterbi decoder. For the sake of simplicity, the channel estimation and frequency synchronization are assumed to be perfect, and the related blocks are not taken into account. The characteristics of the simulated scenarios as well as the PHY layer bit rates are described in Table I.

It must be noted that, with the developed architecture, some blocks cannot support clock frequencies higher than 200 MHz because the critical paths are reached. Therefore, the high bit rate scenarios involving MCS 5 and MCS 8 at 80 MHz (the two highest data rates in the 80 MHz column of Table I) have not been simulated.

TABLE I. SIMULATED SCENARIOS

| Bandwidth                      | 20 MHz  | 40 MHz   | 80 MHz    |
|--------------------------------|---------|----------|-----------|
| FFT size                       | 64      | 128      | 256       |
| Number of data subcarriers     | 52      | 108      | 234       |
| CP size (number of samples)    | 16      | 32       | 64        |
| Data rate, MCS 1 (QPSK 1/2)    | 13 Mbps | 27 Mbps  | 58.5 Mbps |
| Data rate, MCS 3 (16-QAM 1/2)  | 26 Mbps | 54 Mbps  | 117 Mbps  |
| Data rate, MCS 5 (64-QAM 2/3)  | 52 Mbps | 108 Mbps | 234 Mbps  |
| Data rate, MCS 8 (256-QAM 3/4) | 78 Mbps | 162 Mbps | 351 Mbps  |
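The data rates of Table I can be recovered from the other parameters of the table. The Python sketch below assumes the usual OFDM relation, rate = (data subcarriers × coded bits per subcarrier × coding rate) / symbol duration, which is not written out in the text but reproduces every entry of the table. The dictionary and function names are illustrative.

```python
from fractions import Fraction

OFDM_SYMBOL_DURATION_US = 4  # OFDM symbol duration in microseconds

# Number of data subcarriers per bandwidth (MHz), from Table I.
DATA_SUBCARRIERS = {20: 52, 40: 108, 80: 234}

# MCS label -> (coded bits per complex symbol, coding rate), from Table I.
MCS = {
    1: (2, Fraction(1, 2)),   # QPSK 1/2
    3: (4, Fraction(1, 2)),   # 16-QAM 1/2
    5: (6, Fraction(2, 3)),   # 64-QAM 2/3
    8: (8, Fraction(3, 4)),   # 256-QAM 3/4
}


def phy_bit_rate_mbps(bandwidth_mhz: int, mcs: int) -> float:
    """PHY bit rate in Mbps (information bits per microsecond)."""
    coded_bits, code_rate = MCS[mcs]
    info_bits_per_symbol = DATA_SUBCARRIERS[bandwidth_mhz] * coded_bits * code_rate
    return float(info_bits_per_symbol / OFDM_SYMBOL_DURATION_US)


for bw in DATA_SUBCARRIERS:
    rates = ", ".join(f"MCS {m}: {phy_bit_rate_mbps(bw, m):g} Mbps" for m in MCS)
    print(f"{bw} MHz -> {rates}")
```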

The FFT block used in this study is a pipelined, scaled, fixed-point IP core developed by Xilinx. The decoder is a Viterbi decoder with a traceback length of 35 and 8 quantization bits. The FPGA board is a Virtex-7 VC707 from Xilinx [18]. The supply voltage is set to 1 V. A 10-bit precision is used to quantize each component (real and imaginary) of the complex symbols at the IFFT input.

#### C. Architecture

The chosen architecture is such that 2 bits are delivered per clock cycle by the coder. Therefore, after modulation (i.e. constellation mapping), each complex symbol needs K/2 clock cycles to be transmitted, K being the number of coded bits per complex symbol. The IFFT continuously receives one complex symbol per cycle to create one OFDM symbol. This leads to 2 architectures of the Tx/Rx chains, depending on the MCS.

In the case of QPSK with code rate 1/2 (MCS 1), K = 2. During one clock cycle, one data symbol of the modulation block is transmitted to the OFDM framing block; at the same time, the OFDM framing block must send one complex symbol to the IFFT block, including not only data but also pilot and zero symbols. Therefore, to avoid losing data, the data flow must be stopped during the insertion of the pilot and zero symbols. The same reasoning applies to the cyclic prefix insertion after the IFFT computation. The input bit flow is therefore not transmitted continuously, and the clock frequency required to obtain a given bit rate R is equal to: $f = R(N_{FFT} + N_{CP})/N_{data}$, where $N_{FFT}$ is the (I)FFT size, $N_{CP}$ the CP size, and $N_{data}$ the number of complex data symbols per OFDM symbol.

For modulations larger than QPSK, $K \ge 4$: one data symbol contains at least 4 coded bits and therefore requires at least 2 clock cycles to be sent to the OFDM framing block. The insertion of each pilot and zero symbol only needs one clock cycle; it can therefore be done during the transmission of data symbols to the framing block. Hence, the number of cycles needed to generate the framing output, $N_{data} K/2$, is higher than the number of cycles needed to build the temporal OFDM symbol, equal to $N_{FFT} + N_{CP}$. In this case, the flow of data bits is continuous, so that the clock frequency equals the bit rate.
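The two cases above can be summarized in a short Python sketch. The test `n_data * K / 2 >= n_fft + n_cp` is our reading of the cycle-count argument and is used here to select between the two cases; the function names are illustrative, and the 200 MHz limit is the one reported in section III.B.

```python
MAX_CLOCK_HZ = 200e6  # highest clock frequency supported by the architecture


def data_flow_is_continuous(coded_bits_per_symbol: int, n_fft: int,
                            n_cp: int, n_data: int) -> bool:
    """True when sending the data symbols (N_data * K / 2 cycles) takes at
    least as long as building the OFDM symbol (N_FFT + N_CP cycles)."""
    return n_data * coded_bits_per_symbol / 2 >= n_fft + n_cp


def required_clock_hz(bit_rate_bps: float, coded_bits_per_symbol: int,
                      n_fft: int, n_cp: int, n_data: int) -> float:
    """Clock frequency needed to sustain a given PHY bit rate."""
    if data_flow_is_continuous(coded_bits_per_symbol, n_fft, n_cp, n_data):
        # K >= 4 in the studied scenarios: the clock equals the bit rate.
        return bit_rate_bps
    # QPSK (K = 2): the data flow stalls during pilot, zero and CP insertion.
    return bit_rate_bps * (n_fft + n_cp) / n_data


# (label, bit rate in bit/s, K, N_FFT, N_CP, N_data), values from Table I.
scenarios = [
    ("20 MHz, MCS 1", 13e6, 2, 64, 16, 52),
    ("80 MHz, MCS 1", 58.5e6, 2, 256, 64, 234),
    ("80 MHz, MCS 3", 117e6, 4, 256, 64, 234),
    ("80 MHz, MCS 5", 234e6, 6, 256, 64, 234),
]
for label, rate, k, n_fft, n_cp, n_data in scenarios:
    f = required_clock_hz(rate, k, n_fft, n_cp, n_data)
    status = "ok" if f <= MAX_CLOCK_HZ else "exceeds the 200 MHz limit"
    print(f"{label}: {f / 1e6:.1f} MHz ({status})")
```

Under these assumptions, the 80 MHz scenarios with MCS 5 and MCS 8 require clocks of 234 MHz and 351 MHz, which is consistent with their exclusion from the simulations.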

### IV. SIMULATION RESULTS ANALYSIS

#### A. Static power

Fig. 2 shows both the static power and dynamic power obtained via simulations, for two Rx chains, one working in the 20 MHz bandwidth at 52 Mbps (MCS 5); and one in the 40 MHz bandwidth, at 162 Mbps (MCS 8). As expected from the theoretical analysis given in section II, it can be seen that the static power increases with the temperature; whereas the dynamic power, which only depends on the implemented design, remains constant over the whole temperature range.



Fig. 2. Dynamic and static power consumptions

This example gives us an order of magnitude of the weight of the static and dynamic parts of the baseband circuit power. The typical temperature in electronic components measured in our testbed being around 48°C, the dynamic power roughly represents 20% to 50% of the total circuit power. It shows that the dynamic power can exceed the static power. Therefore, in some cases, efforts to reduce dynamic power can allow significant power savings.

#### B. Power distribution among the blocks

The dynamic power consumptions obtained from simulations are given in Table II for each block of the transmission and reception chains, the individual blocks being constantly active. The last two columns give the sum of the dynamic powers of all the blocks of the Tx and Rx chains. The power required to run the scrambling, puncturing, interleaving, mapping and IFFT blocks at the transmission side is the same, respectively, as the power required to run the de-scrambling, de-puncturing, de-interleaving, de-mapping and FFT blocks at the reception side. The framing step has been evaluated within the modulation block. Since the aim is to evaluate the power consumption of the transmitter and receiver, the channel has not been emulated. The channel equalizer has not been developed. However, as it should simply consist of a multiplication by the inverse of the estimated channel coefficients, it would only add a negligible computational cost to the circuitry. Based on these results, pie charts showing the distribution of the power consumption among the blocks of the transmission and reception chains are provided in Fig. 3 and Fig. 4, for two MCSs in the 40 MHz bandwidth.



Fig. 3. Distribution of the power consumption – Tx chain



Fig. 4. Distribution of the power consumption – Rx chain

TABLE II. SIMULATION RESULTS - DYNAMIC POWER PER BLOCK OF TX/RX CHAINS [mW]

| Bandwidth | MCS   | scrambler | interleaver | modulation | IFFT | coder | decoder | Sum Tx | Sum Rx |
|-----------|-------|-----------|-------------|------------|------|-------|---------|--------|--------|
| 20 MHz    | MCS 1 | 1         | 2           | 0          | 12   | 0     | 57      | 15     | 72     |
|           | MCS 3 | 1         | 4           | 4          | 16   | 0     | 74      | 25     | 99     |
|           | MCS 5 | 1         | 20          | 7          | 33   | 0     | 147     | 61     | 208    |
|           | MCS 8 | 1         | 29          | 32         | 50   | 1     | 221     | 113    | 333    |
| 40 MHz    | MCS 1 | 1         | 6           | 1          | 30   | 1     | 112     | 39     | 150    |
|           | MCS 3 | 1         | 11          | 12         | 40   | 1     | 152     | 65     | 216    |
|           | MCS 5 | 1         | 67          | 29         | 80   | 2     | 303     | 179    | 480    |
|           | MCS 8 | 2         | 74          | 65         | 119  | 3     | 456     | 263    | 716    |
| 80 MHz    | MCS 1 | 1         | 18          | 3          | 59   | 1     | 226     | 82     | 307    |
|           | MCS 3 | 1         | 70          | 65         | 87   | 2     | 332     | 225    | 555    |

It can be observed that the IFFT block dominates the power budget of the transmission chain. For high bit rates, the consumption of the interleaver and modulation blocks increases noticeably. In the reception chain, the Viterbi decoder is the most power-consuming block, requiring more than half of the total power. To sum up, according to the results obtained in all of the studied scenarios, the coding/decoding represents around 50% of the whole Tx/Rx consumption; the IFFT/FFT around 25%; and the interleaver/de-interleaver, together with the modulation/de-modulation, around 20%. These results clearly show that significant power savings can be obtained by optimizing the algorithms used in these blocks.
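As a rough check of these shares, the Python sketch below recomputes them from one row of Table II (40 MHz, MCS 8). The grouping of the blocks follows the text; the function names are illustrative, and the Rx blocks are assumed to consume the same power as their Tx counterparts, as stated for Table II.

```python
# Dynamic power per block in mW, row "40 MHz, MCS 8" of Table II.
ROW_40MHZ_MCS8 = {
    "scrambler": 2, "interleaver": 74, "modulation": 65,
    "ifft": 119, "coder": 3, "decoder": 456,
}


def chain_sums_mw(row: dict) -> tuple[int, int]:
    """Return (sum Tx, sum Rx): the blocks shared by both chains plus the
    coder on the Tx side or the decoder on the Rx side."""
    shared = row["scrambler"] + row["interleaver"] + row["modulation"] + row["ifft"]
    return shared + row["coder"], shared + row["decoder"]


def tx_rx_shares(row: dict) -> dict:
    """Share of each block family in the combined Tx + Rx dynamic power."""
    tx, rx = chain_sums_mw(row)
    total = tx + rx
    return {
        "coding/decoding": (row["coder"] + row["decoder"]) / total,
        "(I)FFT": 2 * row["ifft"] / total,
        "interleaving + modulation": 2 * (row["interleaver"] + row["modulation"]) / total,
        "scrambling": 2 * row["scrambler"] / total,
    }


print(chain_sums_mw(ROW_40MHZ_MCS8))
for family, share in tx_rx_shares(ROW_40MHZ_MCS8).items():
    print(f"{family}: {share:.0%}")
```

For this row, the coding/decoding family comes out at roughly 47%, the (I)FFT at roughly 24% and interleaving plus modulation at roughly 28%, in line with the orders of magnitude quoted above.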

The blocks are impacted by the system parameters FFT size, constellation order K and coding rate  $\rho$  as follows. The (I)FFT depends on  $N_{FFT}$ ; the modulation/demodulation (including framing) depends on K and  $N_{FFT}$ ; the interleaver/deinterleaver depends on K,  $\rho$  and  $N_{FFT}$ ; the scrambler/descrambler and coder/decoder are not impacted by the system parameters. Therefore, for a constant system parameter set, the dynamic power of each block increases proportionally to the clock frequency.

The power consumptions of IFFT and Viterbi decoder measured from FPGA board are given in Table III.

TABLE III. FPGA MEASUREMENTS – IFFT AND VITERBI DYNAMIC POWER [mW]

| Bandwidth | MCS   | IFFT | Decoder |
|-----------|-------|------|---------|
| 20 MHz    | MCS 1 | 10   | 50      |
|           | MCS 3 | 20   | 60      |
|           | MCS 5 | 30   | 120     |
|           | MCS 8 | 50   | 180     |
| 40 MHz    | MCS 1 | 30   | 90      |
|           | MCS 3 | 40   | 120     |
|           | MCS 5 | 80   | 220     |
|           | MCS 8 | 110  | 360     |
| 80 MHz    | MCS 1 | 60   | 160     |
|           | MCS 3 | 100  | 250     |

By comparing with the figures of Table II, it can be seen that the difference between measurements and simulations is very small, with 10% average error, which is quite acceptable.

It is also worth emphasizing that the resolution of the power measurement on the FPGA is 10 mW, which prevents small values from being measured accurately.

#### C. Tx/Rx chains consumption

Table IV shows the power consumption of the whole Tx and Rx chains, including all blocks together, obtained by simulation. A comparison with Table II shows that the consumption of the chains is lower than the sum of the powers of all the individual blocks. This is because the figures of Table II correspond to isolated and constantly active blocks, whereas once they are inserted in a chain, these blocks do not have the same activity rate. It is also due to the optimization and sharing of hardware resources (such as memories and registers) that become possible when several blocks are combined. Indeed, simulations show that the number of generated signals equals 2490 when the blocks are implemented separately, whereas it is 2457 in the global chains. In the same way, the number of logic gates is 1502 with separate blocks, while it decreases to 1475 in the chains.

TABLE IV. SIMULATION RESULTS - Tx/Rx Chains dynamic power [mW]

| Bandwidth | MCS   | Tx  | Rx  |
|-----------|-------|-----|-----|
| 20 MHz    | MCS 1 | 12  | 49  |
|           | MCS 3 | 18  | 95  |
|           | MCS 5 | 39  | 135 |
|           | MCS 8 | 64  | 188 |
| 40 MHz    | MCS 1 | 30  | 97  |
|           | MCS 3 | 51  | 151 |
|           | MCS 5 | 171 | 299 |
|           | MCS 8 | 192 | 479 |
| 80 MHz    | MCS 1 | 81  | 202 |
|           | MCS 3 | 210 | 439 |
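The gap between the per-block sums of Table II and the chain figures of Table IV can be quantified with the short Python sketch below. The values are copied from the two tables; the relative reduction it computes is an illustrative metric of ours and is not reported in the study.

```python
# (bandwidth, MCS) -> (Tx, Rx) dynamic power in mW.
SUM_OF_BLOCKS = {   # last two columns of Table II
    ("20 MHz", "MCS 1"): (15, 72),   ("20 MHz", "MCS 3"): (25, 99),
    ("20 MHz", "MCS 5"): (61, 208),  ("20 MHz", "MCS 8"): (113, 333),
    ("40 MHz", "MCS 1"): (39, 150),  ("40 MHz", "MCS 3"): (65, 216),
    ("40 MHz", "MCS 5"): (179, 480), ("40 MHz", "MCS 8"): (263, 716),
    ("80 MHz", "MCS 1"): (82, 307),  ("80 MHz", "MCS 3"): (225, 555),
}
WHOLE_CHAIN = {     # Table IV
    ("20 MHz", "MCS 1"): (12, 49),   ("20 MHz", "MCS 3"): (18, 95),
    ("20 MHz", "MCS 5"): (39, 135),  ("20 MHz", "MCS 8"): (64, 188),
    ("40 MHz", "MCS 1"): (30, 97),   ("40 MHz", "MCS 3"): (51, 151),
    ("40 MHz", "MCS 5"): (171, 299), ("40 MHz", "MCS 8"): (192, 479),
    ("80 MHz", "MCS 1"): (81, 202),  ("80 MHz", "MCS 3"): (210, 439),
}


def relative_reduction(sum_mw: float, chain_mw: float) -> float:
    """Fraction of the per-block sum saved once the blocks are chained."""
    return (sum_mw - chain_mw) / sum_mw


for scenario, (sum_tx, sum_rx) in SUM_OF_BLOCKS.items():
    chain_tx, chain_rx = WHOLE_CHAIN[scenario]
    print(f"{scenario[0]} {scenario[1]}: "
          f"Tx -{relative_reduction(sum_tx, chain_tx):.0%}, "
          f"Rx -{relative_reduction(sum_rx, chain_rx):.0%}")
```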

The power consumptions of the Tx and Rx chains measured on the FPGA board are shown in Fig. 5 and Fig. 6 respectively; the power cost increases with the bit rate. It must be emphasized that, for a given bandwidth, i.e. a fixed FFT size, the power consumption of almost all blocks increases linearly with the clock frequency, and therefore with the bit rate. The interleaver and the constellation mapping are impacted by the constellation order in a more complex way, which explains why the curves are not perfectly linear.



Fig. 5. FPGA measurements - Tx chain dynamic power



Fig. 6. FPGA measurements – Rx chain dynamic power

### V. CONCLUSIONS

In the context of Home multimedia services, this paper has described the first steps towards low power Wi-Fi through the evaluation of the power consumption, in various configurations. For reliable and accurate results, this evaluation includes implementation on a hardware platform. Investigations have focused on optimizations of the architecture to reach a judicious design of the physical layer components and to integrate them in the chains. The distribution of the power consumption within the whole transceiver has been evaluated and analyzed. Hence, the greediest blocks identified, such as the decoder and the FFT, can be further optimized in order to reduce their complexity.

A complete methodology has been worked out to evaluate the power consumption of the digital signal processing algorithms; it could be used to take energy efficiency into account from the beginning of the design step for new components. This methodology can also be applied to the optional Low Density Parity Check (LDPC) codes that are specified in the standard, which would allow a comparison with the convolutional encoder and Viterbi decoder. In the same way, the power consumption of Multiple Input Multiple Output (i.e. multi-antenna) schemes, which are expected to increase the power consumption, could be quantified.

### ACKNOWLEDGMENT

This work was supported by the 3-year French project FUI14 GREENCoMM, which ended in April 2016 [4].

### REFERENCES

- [1] GreenTouch project, http://www.greentouch.org/
- [2] EARTH project, https://www.ict-earth.eu/
- [3] Code of conduct on energy consumption of broadband equipment, v5.0, European Commission Joint Research Centre, Institute for Energy and Transport, Renewable Energy Unit, Dec. 2013.
- [4] GREENCoMM project, http://greencomm-projet.org/
- [5] Y. Chen, S. Zhang, S. Xu, G.Y. Li, "Fundamental trade-offs on green wireless networks," IEEE Commun. Mag., vol. 49, no. 6, pp. 30-37, June 2011.
- [6] G.Y. Li, Z. Xu, C. Xiong, C. Yang, S. Zhang, et al., "Energy-efficient wireless communications: tutorial, survey, and open issues," IEEE Wireless Commun., vol. 18, no. 6, pp. 28-35, Dec. 2011.
- [7] G. Miao, N. Himayat, Y. Li, and A. Swami, "Cross-layer optimization for energy-efficient wireless communications: a survey," Wiley journal Wireless Communications and Mobile Computing, vol. 9, no. 4, pp. 529-542, Apr. 2009.
- [8] Y. Lin, H. Lee, M. Woh, Y. Harel, S. Mahlke and T. Mudge, "SODA: a low-power architecture for software radio," in Proc. 33rd Int. Symp. Computer Architecture (ISCA'06), Boston, June 2006, pp. 89-101.
- [9] B.H. Calhoun, D.C. Daly, N. Verma, D.F. Finchelstein, D.D. Wentzloff, et al., "Design considerations for ultra-low energy wireless microsensor nodes," IEEE Trans. Comput., vol. 54, no. 6, pp. 727-740, June 2005.
- [10] P. Grover, K. Woyach and A. Sahai, "Towards a communication-theoretic understanding of system-level power consumption," IEEE J. Sel. Areas Commun., vol. 29, no. 8, pp. 1744-1755, Sept. 2011.
- [11] IEEE Standard for Inform. Technology, Telecommun. and inform. exchange between systems - Local and metropolitan area networks - Specific requirements, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, IEEE 802.11ac-2013.
- [12] Y. Li, B. Bakkaloglu and C. Chakrabarti, "A system level energy model and energy-quality evaluation for integrated transceiver front-ends," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 1, pp. 90-103, Jan. 2007.
- [13] Broadband Radio Access Networks (BRAN); 5 GHz high performance RLAN; Harmonized EN covering the essential requirements of article 3.2 of the R&TTE Directive, ETSI EN 301 893 v1.7.1, June 2012.
- [14] A. P. Chandrakasan, S. Sheng, R. W. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circuits, vol. 27, no. 4, pp. 473-484, Apr. 1992.
- [15] Y. Liu, R.P. Dick, L. Shang, H. Yang, "Accurate temperature-dependent integrated circuit leakage power estimation is easy," in Proc. Design, Automation & Test in Europe Conference & Exhibition (DATE'07), Nice, Apr. 2007, pp. 1-6.
- [16] Xilinx Power Tools Tutorial, Spartan-6 and Virtex-6 FPGA, v14.4, Xilinx, Dec. 2012.
- [17] Using the UCD92xx Digital Point-of-Load Controller, Design Guide, Texas Instruments, April 2011.
- [18] VC707 Evaluation board for the Virtex-7 FPGA, User guide, v 1.7, Xilinx, March 2016.